Clustering Variable Length Sequences by Eigenvector Decomposition Using HMM
Abstract
We present a novel clustering method using HMM parameter space and eigenvector decomposition. Unlike existing methods, our algorithm can cluster both constant-length and variable-length sequences without requiring normalization of the data. We show that the number of clusters governs the number of eigenvectors used to span the feature similarity space, which lets us compute the optimal number of clusters automatically. We show that the proposed method accurately clusters variable-length sequences in various scenarios.
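The abstract does not give the feature construction, the affinity function, or the exact rule that links eigenvectors to the cluster count, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each sequence has already been summarized as a fixed-length vector of fitted HMM parameters (the HMM fitting step is not shown), uses a Gaussian affinity with a hypothetical width `sigma`, and picks the number of clusters with the standard eigengap heuristic, which may differ from the authors' criterion. All function names are illustrative.

```python
import numpy as np


def affinity_matrix(features, sigma=1.0):
    """Gaussian affinity between per-sequence feature vectors.

    Assumption: `features` has one row per sequence, e.g. flattened HMM
    parameters. Because the HMM summarizes the sequence, rows have the
    same length even when the sequences do not.
    """
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def estimate_num_clusters(affinity, max_k=10):
    """Eigengap heuristic (an assumed stand-in for the paper's rule):
    choose k at the largest drop between consecutive eigenvalues."""
    eigvals = np.linalg.eigvalsh(affinity)[::-1]  # descending order
    top = eigvals[: max_k + 1]
    gaps = top[:-1] - top[1:]
    return int(np.argmax(gaps)) + 1


def spectral_embedding(affinity, k):
    """Rows of the k leading eigenvectors, normalized to unit length."""
    _, eigvecs = np.linalg.eigh(affinity)  # eigenvalues ascending
    top = eigvecs[:, -k:]
    norms = np.linalg.norm(top, axis=1, keepdims=True)
    return top / np.maximum(norms, 1e-12)


def assign_clusters(embedding, k, n_iter=50):
    """Small deterministic k-means with farthest-point initialization."""
    centers = [embedding[0]]
    for _ in range(1, k):
        dist = np.min(
            [np.sum((embedding - c) ** 2, axis=1) for c in centers], axis=0
        )
        centers.append(embedding[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((embedding[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = embedding[labels == j].mean(axis=0)
    return labels


# Toy stand-in for HMM parameter vectors: three tight groups of four.
group_centers = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
offsets = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0]])
features = np.vstack([c + offsets for c in group_centers])

A = affinity_matrix(features)
k = estimate_num_clusters(A)
labels = assign_clusters(spectral_embedding(A, k), k)
print(k, labels)
```

The eigengap rule is used here because it needs no extra parameters beyond `max_k`; on real HMM parameter vectors the choice of `sigma` and the scaling of the individual parameters would likely matter a good deal.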
Similar resources
Malware Detection using Classification of Variable-Length Sequences
In this paper, a novel graph-based method is proposed for feature extraction in the classification of variable-length sequences. The proposed method overcomes the problems that traditional graphs have with variable-length data, without fixing the length of the sequences, by determining the most frequent instructions and placing the remaining instructions in an “other” set, which improves speed and saves memory. Acco...
Acoustic modeling based on model structure annealing for speech recognition
This paper proposes an HMM training technique using multiple phonetic decision trees and evaluates it in speech recognition. In the use of context dependent models, the decision tree based context clustering is applied to find a parameter tying structure. However, the clustering is usually performed based on statistics of HMM state sequences which are obtained by unreliable models without conte...
Clustering with Hidden Markov Model on Variable Blocks
Large-scale data containing multiple important rare clusters, even at moderately high dimensions, pose challenges for existing clustering methods. To address this issue, we propose a new mixture model called Hidden Markov Model on Variable Blocks (HMM-VB) and a new mode search algorithm called Modal Baum-Welch (MBW) for mode-association clustering. HMM-VB leverages prior information about chain...
Trajectory Pattern Detection by HMM Parameter Space Features and Eigenvector Clustering
We develop an object trajectory pattern learning method that has two significant advantages over past work. First, we represent trajectories in the HMM parameter space which overcomes the trajectory sampling problems of the existing methods. The proposed features are more expressive and enable detection of trajectory patterns that cannot be detected with the conventional trajectory representati...
Continuous Word Recognition Based on the Stochastic Segment Model
This paper presents an overview of the Boston University continuous word recognition system, which is based on the Stochastic Segment Model (SSM). The key components of the system described here include: a segment-based acoustic model that uses a family of Gaussian distributions to characterize variable length segments; a divisive clustering technique for estimating robust context-dependent mod...
Publication date: 2004